The RBSE Spider - Balancing Effective Search Against Web Load

Author

  • David Eichmann
Abstract

The design of a Web spider entails many things, including a concern for reasonable behavior, as well as more technical concerns. The RBSE Spider is a mechanism for exploring World Wide Web structure and indexing useful material thereby discovered. We relate our experience in constructing and operating this spider.
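
The abstract's central concern is keeping the search effective without placing undue load on the servers being crawled. As a rough illustration of that kind of "reasonable behavior" policy (a minimal sketch of common practice, not the paper's actual implementation), the code below checks robots.txt and enforces a per-host delay between requests; the agent name, delay value, and seed URL are all hypothetical.

```python
import time
import urllib.parse
import urllib.request
import urllib.robotparser
from collections import deque

CRAWL_DELAY = 5.0      # assumed per-host delay in seconds; not a figure from the paper
AGENT = "example-spider"
last_fetch = {}        # host -> time of the most recent request to that host
robots_cache = {}      # host -> parsed robots.txt rules

def allowed(url):
    """Consult the host's robots.txt (cached per host) before fetching."""
    host = urllib.parse.urlparse(url).netloc
    rp = robots_cache.get(host)
    if rp is None:
        rp = urllib.robotparser.RobotFileParser(f"http://{host}/robots.txt")
        try:
            rp.read()
        except OSError:
            pass   # robots.txt unreachable: can_fetch() stays conservative
        robots_cache[host] = rp
    return rp.can_fetch(AGENT, url)

def polite_fetch(url):
    """Fetch a page only after the per-host delay has elapsed."""
    host = urllib.parse.urlparse(url).netloc
    wait = CRAWL_DELAY - (time.time() - last_fetch.get(host, 0.0))
    if wait > 0:
        time.sleep(wait)
    last_fetch[host] = time.time()
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()

if __name__ == "__main__":
    frontier = deque(["http://example.com/"])   # hypothetical seed
    while frontier:
        url = frontier.popleft()
        if allowed(url):
            page = polite_fetch(url)
            # here the spider would index `page` and push newly found links onto `frontier`
```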

Similar articles

Data Partition and Job Scheduling Technique for Balancing Load in Cluster of Web Spiders

Search engines rely primarily on web spiders to collect large amounts of data for indexing and analysis. Data collection can be performed by several web-spider agents running in a parallel or distributed manner over a cluster of workstations. This parallelization is often necessary in order to cope with a large number of pages in a reasonable amount of time. However, while keeping a good paralle...
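
As a loose illustration of the data-partition idea described above (my own sketch, not the scheme from that paper), URLs can be assigned to spider agents by hashing their host name, so each host is crawled by exactly one agent and the per-agent job lists stay roughly balanced; the worker count and function names below are made up.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlparse

NUM_WORKERS = 4   # hypothetical cluster size

def owner(url, num_workers=NUM_WORKERS):
    """Map a URL to a worker by hashing its host, so one host never
    splits across workers and politeness can be enforced locally."""
    host = urlparse(url).netloc.lower()
    digest = hashlib.md5(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_workers

def partition(urls):
    """Group a batch of discovered URLs into per-worker job lists."""
    jobs = defaultdict(list)
    for url in urls:
        jobs[owner(url)].append(url)
    return jobs

if __name__ == "__main__":
    batch = [
        "http://a.example/page1",
        "http://a.example/page2",
        "http://b.example/",
        "http://c.example/docs",
    ]
    for worker, urls in sorted(partition(batch).items()):
        print(f"worker {worker}: {urls}")
```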

Load Balancing Approaches for Web Servers: A Survey of Recent Trends

Numerous works have been done on load balancing of web servers in the grid environment. The reason behind the popularity of the grid environment is that it allows access to distributed resources located at remote sites. For effective utilization, load must be balanced among all resources. The importance of load balancing is discussed by distinguishing between the system without load balancing and with loa...

Web-Collaborative Filtering: Recommending Music by Spidering The Web

We show that it is possible to collect data that is useful for collaborative filtering (CF) using an autonomous Web spider. In CF, entities are recommended to a new user based on the stated preferences of other, similar users. We describe a CF spider that collects from the Web lists of semantically related entities. These lists can then be used by existing CF algorithms by encoding them as "pse...
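
A rough sketch of how such crawled lists could feed an existing CF algorithm (my reading of the general idea, not that paper's code): each list is treated as one artificial user whose "liked" items are the list members, and item-to-item co-occurrence across those artificial users then drives the recommendation. The artist names and function names are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical lists of related artists, as a spider might scrape them from
# playlist or discography pages; each list acts as one artificial user.
crawled_lists = [
    ["artist_a", "artist_b", "artist_c"],
    ["artist_b", "artist_c", "artist_d"],
    ["artist_a", "artist_c"],
]

def cooccurrence(lists):
    """Count how often two items appear in the same crawled list."""
    counts = defaultdict(int)
    for items in lists:
        for x, y in combinations(sorted(set(items)), 2):
            counts[(x, y)] += 1
    return counts

def recommend(seed, lists, top_n=3):
    """Rank items by how often they co-occur with the seed item."""
    counts = cooccurrence(lists)
    scores = defaultdict(int)
    for (x, y), c in counts.items():
        if x == seed:
            scores[y] += c
        elif y == seed:
            scores[x] += c
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

if __name__ == "__main__":
    print(recommend("artist_a", crawled_lists))   # e.g. ['artist_c', 'artist_b']
```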

Parallel Web Spiders for Cooperative Information Gathering

The Web spider is a widely used approach to obtaining information for search engines. As the size of the Web grows, it becomes a natural choice to parallelize the spider's crawling process. This paper presents a parallel web spider model based on a multi-agent system for cooperative information gathering. It uses a dynamic assignment mechanism to eliminate redundant web pages caused by parallelization....
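
A toy sketch of the dynamic-assignment idea (a single-process stand-in, not the multi-agent model described above): a coordinator keeps one shared set of URLs already handed out, so the same page is never assigned to two spiders, and each new URL goes to whichever spider currently holds the shortest queue. The class and variable names are invented for illustration.

```python
from collections import deque

class Coordinator:
    """Hands each newly discovered URL to exactly one spider queue."""

    def __init__(self, num_spiders):
        self.queues = [deque() for _ in range(num_spiders)]
        self.assigned = set()   # shared dedup set: URLs already given out

    def submit(self, url):
        """Dynamically assign `url` to the least-loaded spider, skipping duplicates."""
        if url in self.assigned:
            return None
        self.assigned.add(url)
        target = min(range(len(self.queues)), key=lambda i: len(self.queues[i]))
        self.queues[target].append(url)
        return target

if __name__ == "__main__":
    coord = Coordinator(num_spiders=3)
    for u in ["http://x.example/1", "http://x.example/2",
              "http://x.example/1",            # duplicate: dropped
              "http://y.example/", "http://z.example/"]:
        print(u, "->", coord.submit(u))
```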



Publication date: 1994